Lab Workshop: Multiple Models as Judges to Construct Labeled Datasets

How can researchers and students assess the ground truth about the performance of language models on specific tasks?

This workshop demonstrates how LLM-as-a-Judge can turn large, unstructured collections of business documents into structured datasets. Participants will explore new Quinlan-built language models that extract and label workplace skills and tasks from natural language, review sample code and real-world use cases, and learn how to interpret outputs and compile summary reports using multiple LLM-as-a-Judge assessments of ground truth.

Event info:

Date: Wednesday, October 29, 2025
Time: Noon-1:00 p.m. (CT)
Format: Online

Speakers:

Research Paper: Extracting O*NET Features from the NLx Corpus to Build Public Use Aggregate Labor Market Data
